

Tensor Factorization





Reviewers remark that our method is intuitive and correct and that it opens new directions in sparse clustering, while R1 raised

Neural Information Processing Systems

We thank you for commenting that the paper is well-written and for finding a typo. You suggest better baselines for comparison, citing Power k-means [37] and matrix and tensor factorization. See for instance "k-means Clustering Is Matrix Factorization" (Bauckhage 2015). We thank you for your detailed comments and careful reading of the paper. You are absolutely correct that "interpretable sparsity" is overloaded here.
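The Bauckhage (2015) connection can be made concrete: the k-means objective is exactly a constrained matrix factorization $\|X - ZM\|_F^2$, where $Z$ is a binary one-hot assignment matrix and $M$ stacks the centroids. Below is a minimal numpy sketch of this equivalence; the data, number of clusters, and seed are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))          # n samples, d features
k = 3

# Random centroids and hard assignments, as in one Lloyd iteration.
M = X[rng.choice(len(X), k, replace=False)]   # k x d centroid matrix
labels = np.argmin(((X[:, None] - M[None]) ** 2).sum(-1), axis=1)

# k-means objective: sum of squared distances to assigned centroids.
kmeans_obj = ((X - M[labels]) ** 2).sum()

# Matrix-factorization view (Bauckhage 2015): X ~ Z M with a binary
# assignment matrix Z whose rows are one-hot cluster indicators.
Z = np.zeros((len(X), k))
Z[np.arange(len(X)), labels] = 1.0
mf_obj = np.linalg.norm(X - Z @ M, "fro") ** 2

assert np.isclose(kmeans_obj, mf_obj)  # the two objectives coincide
```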




Duality-Induced Regularizer for Tensor Factorization Based Knowledge Graph Completion Supplementary Material

Neural Information Processing Systems

In DB models, the commonly used p is either 1 or 2. When p = 2, DURA takes the form given in Equation (8) of the main text. When p = 1, we cannot expand the squared score function of the associated DB models as in Equation (4); we therefore choose p = 2. Table 2 lists the hyperparameters found by grid search. Suppose that k is the number of triplets known to be true in the knowledge graph and n is the embedding dimension of entities. Then the computational complexity of weighted DURA is the same as that of the weighted squared Frobenius norm regularizer.
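A rough sketch of the complexity claim, assuming a CP-style DURA term in which each known-true triplet (h, r, t) contributes $\|h \circ r\|_2^2 + \|t\|_2^2$ (the exact weighted form is Equation (8) of the main text; the per-triplet weights w below are hypothetical): summing over k triplets of dimension n costs O(kn), the same order as a weighted squared Frobenius norm over the same embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                       # embedding dimension
k = 1000                     # number of known-true triplets
H = rng.normal(size=(k, n))  # head-entity embeddings, one row per triplet
R = rng.normal(size=(k, n))  # relation embeddings
T = rng.normal(size=(k, n))  # tail-entity embeddings
w = rng.random(k)            # per-triplet weights (hypothetical)

# Assumed CP-style DURA term with p = 2: each triplet adds
# ||h o r||^2 + ||t||^2, with "o" the elementwise product.
# Summing over k triplets costs O(k n).
dura = np.sum(w * (((H * R) ** 2).sum(axis=1) + (T ** 2).sum(axis=1)))

# Weighted squared Frobenius norm over the same embeddings: also O(k n).
frob = np.sum(w * ((H ** 2).sum(axis=1) + (T ** 2).sum(axis=1)))
print(dura, frob)
```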




Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Neural Information Processing Systems

Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks).
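The test-bed can be made concrete with a short sketch: run gradient descent on a depth-2 linear network $W_2 W_1$ fit to the observed entries of a low-rank matrix, then inspect which norm, if any, the recovered completion minimizes. The dimensions, learning rate, and small initialization below are illustrative choices, not the paper's experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
# Ground-truth rank-1 matrix, observed on a random subset of entries.
M = rng.normal(size=(d, 1)) @ rng.normal(size=(1, d))
mask = rng.random((d, d)) < 0.5

# Depth-2 linear network: the completed matrix is parameterized as W2 @ W1.
# Small initialization is what makes any implicit bias of gradient
# descent visible in this test-bed.
W1 = 1e-3 * rng.normal(size=(d, d))
W2 = 1e-3 * rng.normal(size=(d, d))
lr = 0.05

for _ in range(20000):
    E = mask * (W2 @ W1 - M)   # residual on observed entries only
    W1 -= lr * (W2.T @ E)      # gradient of 0.5 * ||E||_F^2 w.r.t. W1
    W2 -= lr * (E @ W1.T)      # gradient of 0.5 * ||E||_F^2 w.r.t. W2

X = W2 @ W1
print("nuclear norm of GD solution:",
      np.linalg.svd(X, compute_uv=False).sum())
```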